Hierarchical Dual-Head Model for Suicide Risk Assessment via MentalRoBERTa
Yang, Chang, Wang, Ziyi, Tan, Wangfeng, Tan, Zhiting, Ji, Changrui, Zhou, Zhiming
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China (ziyiwang2003@bupt.edu.cn)
Social media platforms have become important sources for identifying suicide risk, but automated detection systems face multiple challenges, including severe class imbalance, temporal complexity in posting patterns, and the dual nature of risk levels as both ordinal and categorical. This paper proposes a hierarchical dual-head neural network based on MentalRoBERTa for suicide risk classification into four levels: indicator, ideation, behavior, and attempt. The model employs two complementary prediction heads operating on a shared sequence representation: a CORAL (Consistent Rank Logits) head that preserves ordinal relationships between risk levels, and a standard classification head that enables flexible categorical distinctions. A 3-layer Transformer encoder with 8-head multi-head attention models temporal dependencies across post sequences, while explicit time interval embeddings capture posting behavior dynamics. The model is trained with a combined loss function (0.5 CORAL + 0.3 Cross-Entropy + 0.2 Focal Loss) that simultaneously addresses ordinal structure preservation, overconfidence reduction, and class imbalance. To improve computational efficiency, we freeze the first 6 layers (50%) of MentalRoBERTa and employ mixed-precision training. The model is evaluated using 5-fold stratified cross-validation with macro F1 score as the primary metric.
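The combined objective (0.5 CORAL + 0.3 cross-entropy + 0.2 focal) can be sketched numerically. This is a minimal NumPy illustration of the three loss terms and their weighting, not the paper's training code; the focal exponent gamma=2 and the logit shapes are assumptions.

```python
import numpy as np

def coral_loss(ordinal_logits, label, num_classes=4):
    """CORAL: K-1 shared binary logits; the k-th target is 1 iff label > k."""
    levels = (np.arange(num_classes - 1) < label).astype(float)
    p = 1.0 / (1.0 + np.exp(-ordinal_logits))  # sigmoid per threshold
    return -np.sum(levels * np.log(p) + (1 - levels) * np.log(1 - p))

def cross_entropy(logits, label):
    """Standard softmax cross-entropy for the categorical head."""
    z = logits - logits.max()  # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def focal_loss(logits, label, gamma=2.0):
    """Focal loss: down-weights easy examples via the (1 - p_t)^gamma factor."""
    z = logits - logits.max()
    probs = np.exp(z) / np.exp(z).sum()
    p_t = probs[label]
    return -((1 - p_t) ** gamma) * np.log(p_t)

def combined_loss(ordinal_logits, class_logits, label):
    """Weighted sum from the abstract: 0.5 CORAL + 0.3 CE + 0.2 focal."""
    return (0.5 * coral_loss(ordinal_logits, label)
            + 0.3 * cross_entropy(class_logits, label)
            + 0.2 * focal_loss(class_logits, label))
```

With four risk levels the ordinal head emits three threshold logits, while the categorical head emits four class logits; the two heads share the same sequence representation upstream.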
DenseRec: Revisiting Dense Content Embeddings for Sequential Transformer-based Recommendation
Lichtenberg, Jan Malte, De Candia, Antonio, Ruffini, Matteo
Transformer-based sequential recommenders, such as SASRec or BERT4Rec, typically rely solely on learned item ID embeddings, making them vulnerable to the item cold-start problem, particularly in environments with dynamic item catalogs. While dense content embeddings from pre-trained models offer potential solutions, direct integration into transformer-based recommenders has consistently underperformed compared to ID-only approaches. We revisit this integration challenge and propose DenseRec, a simple yet effective method that introduces a dual-path embedding approach. DenseRec learns a linear projection from the dense embedding space into the ID embedding space during training, enabling seamless generalization to previously unseen items without requiring specialized embedding models or complex infrastructure. In experiments on three real-world datasets, we find DenseRec to consistently outperform an ID-only SASRec baseline, even without additional hyperparameter tuning and while using compact embedding models. Our analysis suggests improvements primarily arise from better sequence representations in the presence of unseen items, positioning DenseRec as a practical and robust solution for cold-start sequential recommendation.
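The dual-path idea can be sketched in a few lines: known items use their learned ID embedding, while unseen items are projected from content space into ID space. A least-squares fit stands in for the projection the abstract says is learned during training, and all dimensions and data here are made-up toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

d_content, d_id, n_seen = 16, 8, 100
content = rng.normal(size=(n_seen, d_content))  # dense content embeddings (pre-trained)
id_emb = rng.normal(size=(n_seen, d_id))        # learned ID embeddings (toy values)

# Linear projection content -> ID space. Least squares is a stand-in for
# the projection DenseRec learns end-to-end during training.
W, *_ = np.linalg.lstsq(content, id_emb, rcond=None)

def embed(item_idx=None, content_vec=None):
    """Dual path: ID embedding for known items, projected content otherwise."""
    if item_idx is not None:
        return id_emb[item_idx]
    return content_vec @ W

cold = rng.normal(size=d_content)  # content embedding of an unseen item
vec = embed(content_vec=cold)      # lands in the same space as the ID embeddings
```

Because the projected vector lives in the ID embedding space, the downstream transformer can score unseen items with no architectural change.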
La-Proteina: Atomistic Protein Generation via Partially Latent Flow Matching
Geffner, Tomas, Didi, Kieran, Cao, Zhonglin, Reidenbach, Danny, Zhang, Zuobai, Dallago, Christian, Kucukbenli, Emine, Kreis, Karsten, Vahdat, Arash
Recently, many generative models for de novo protein structure design have emerged. Yet, only a few tackle the difficult task of directly generating fully atomistic structures jointly with the underlying amino acid sequence. This is challenging, for instance, because the model must reason over side chains that change in length during generation. We introduce La-Proteina for atomistic protein design based on a novel partially latent protein representation: coarse backbone structure is modeled explicitly, while sequence and atomistic details are captured via per-residue latent variables of fixed dimensionality, thereby effectively side-stepping challenges of explicit side-chain representations. Flow matching in this partially latent space then models the joint distribution over sequences and full-atom structures. La-Proteina achieves state-of-the-art performance on multiple generation benchmarks, including all-atom co-designability, diversity, and structural validity, as confirmed through detailed structural analyses and evaluations. Notably, La-Proteina also surpasses previous models in atomistic motif scaffolding performance, unlocking critical atomistic structure-conditioned protein design tasks. Moreover, La-Proteina is able to generate co-designable proteins of up to 800 residues, a regime where most baselines collapse and fail to produce valid samples, demonstrating La-Proteina's scalability and robustness.
Robust signal decompositions on the circle
Imagine an agent moving along a circular path in the plane with some stationary landmarks, whose number and exact locations are unknown to the agent. Suppose that each landmark transmits an omnidirectional signal with a finite range, which we can model as a function that equals 1 inside a circular disk centered at the landmark and 0 outside. The boundaries of these disks, whose radii are in general different, may intersect the agent's path at one or two points or not at all. As the agent moves along its path, it can perceive these signals and so it knows, at each point, the number of landmarks that are within range. It cannot, however, identify different landmarks by their signals, and neither can it discern anything about each signal's strength other than its presence or absence. The agent's knowledge of its position on the circle may also not be precise, and the signal transmissions or measurements may occur with some sampling frequency rather than continuously in time. For these reasons, all that the agent can reliably reconstruct is a sequence of nonnegative integers corresponding to local landmark counts around the circle, and it may not be sure of the precise count at the exact points where this count changes. In this scenario, we want to pose the following questions: Can the agent figure out the total number of landmarks (excluding, of course, those whose signals do not reach any points on the circle)?
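The counting question above has a simple partial answer when no landmark's disk contains the entire path: each such landmark's signal meets the circle in a proper arc, so it contributes exactly one rising edge to the cyclic count sequence. A minimal sketch, assuming noise-free counts and no whole-circle disks (those add a constant offset that rising edges cannot detect):

```python
def landmark_count(counts):
    """Estimate the number of landmarks whose signal reaches the path.

    `counts` is the cyclic sequence of local landmark counts sampled
    around the circle. Summing the positive increments (the last sample
    wraps back to the first) counts one rising edge per signal arc.
    Assumes no disk covers the whole path and that counts are exact.
    """
    n = len(counts)
    return sum(max(counts[(i + 1) % n] - counts[i], 0) for i in range(n))
```

For example, two overlapping arcs produce a count profile like 0, 1, 2, 1, 0, 0, whose two rising edges recover both landmarks; a single landmark covering the whole circle yields a constant profile and is invisible to this estimator.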
Are Information Retrieval Approaches Good at Harmonising Longitudinal Survey Questions in Social Science?
Li, Wing Yan, Wang, Zeqiang, Johnson, Jon, De, Suparna
Automated detection of semantically equivalent questions in longitudinal social science surveys is crucial for long-term studies informing empirical research in the social, economic, and health sciences. Retrieving equivalent questions faces dual challenges: inconsistent representation of theoretical constructs (i.e. concept/sub-concept) across studies as well as between question and response options, and the evolution of vocabulary and structure in longitudinal text. To address these challenges, our multi-disciplinary collaboration of computer scientists and survey specialists presents a new information retrieval (IR) task of identifying concept (e.g. Housing, Job, etc.) equivalence across question and response options to harmonise longitudinal population studies. This paper investigates multiple unsupervised approaches on a survey dataset spanning 1946-2020, including probabilistic models, linear probing of language models, and pre-trained neural networks specialised for IR. We show that IR-specialised neural models achieve the highest overall performance with other approaches performing comparably. Additionally, the re-ranking of the probabilistic model's results with neural models only introduces modest improvements of 0.07 at most in F1-score. Qualitative post-hoc evaluation by survey specialists shows that models generally have a low sensitivity to questions with high lexical overlap, particularly in cases where sub-concepts are mismatched. Altogether, our analysis serves to further research on harmonising longitudinal studies in social science.
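The retrieval task can be illustrated with the simplest unsupervised baseline: bag-of-words cosine similarity between a query question and candidate survey questions. This toy sketch is far weaker than the probabilistic and neural IR models the paper evaluates; the example questions are invented.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str]) -> str:
    """Return the corpus question most similar to the query."""
    q = Counter(query.lower().split())
    return max(corpus, key=lambda d: cosine(q, Counter(d.lower().split())))

corpus = [
    "How many rooms does your accommodation have?",
    "What is your current occupation?",
    "Do you rent or own your home?",
]
best = retrieve("How many rooms are in your home?", corpus)
```

Lexical matching like this is exactly where the paper's qualitative analysis flags trouble: high word overlap with a mismatched sub-concept still scores well, which motivates the IR-specialised neural models.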
Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation
Yan, Keqiang, Li, Xiner, Ling, Hongyi, Ashen, Kenna, Edwards, Carl, Arróyave, Raymundo, Zitnik, Marinka, Ji, Heng, Qian, Xiaofeng, Qian, Xiaoning, Ji, Shuiwang
We consider the problem of crystal materials generation using language models (LMs). A key step is to convert 3D crystal structures into 1D sequences to be processed by LMs. Prior studies used the crystallographic information framework (CIF) file stream, which fails to ensure SE(3) and periodic invariance and may not lead to unique sequence representations for a given crystal structure. Here, we propose a novel method, known as Mat2Seq, to tackle this challenge. Mat2Seq converts 3D crystal structures into 1D sequences and ensures that different mathematical descriptions of the same crystal are represented in a single unique sequence, thereby provably achieving SE(3) and periodic invariance. Experimental results show that, with language models, Mat2Seq achieves promising performance in crystal structure generation as compared with prior methods.
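The uniqueness requirement can be illustrated with a toy canonicalization: sorting atoms into a fixed order makes the emitted sequence invariant to how the atom list was permuted. This captures only permutation invariance, not the SE(3) and periodic invariance Mat2Seq achieves (which also requires canonicalizing the cell and origin); the element symbols and coordinates are made up.

```python
def canonical_sequence(species, frac_coords):
    """Toy canonical ordering: sort atoms by (element, fractional coords)
    so any permutation of the same atom list yields one unique string.
    A stand-in for Mat2Seq's full canonicalization, which additionally
    fixes the lattice representation to gain SE(3)/periodic invariance."""
    atoms = sorted(
        zip(species, (tuple(round(x, 6) for x in c) for c in frac_coords))
    )
    return " ".join(f"{s} {a:.3f} {b:.3f} {c:.3f}" for s, (a, b, c) in atoms)

# Two descriptions of the same toy crystal, with atoms listed in different orders.
s1 = canonical_sequence(
    ["Si", "O", "O"],
    [(0.0, 0.0, 0.0), (0.25, 0.5, 0.75), (0.75, 0.5, 0.25)],
)
s2 = canonical_sequence(
    ["O", "Si", "O"],
    [(0.75, 0.5, 0.25), (0.0, 0.0, 0.0), (0.25, 0.5, 0.75)],
)
```

Without such a canonical order, the same crystal could tokenize to many different sequences, which is the failure mode the abstract attributes to raw CIF streams.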
EvoLlama: Enhancing LLMs' Understanding of Proteins via Multimodal Structure and Sequence Representations
Liu, Nuowei, Sun, Changzhi, Ji, Tao, Tian, Junfeng, Tang, Jianxin, Wu, Yuanbin, Lan, Man
Current Large Language Models (LLMs) for understanding proteins primarily treat amino acid sequences as a text modality. Meanwhile, Protein Language Models (PLMs), such as ESM-2, have learned massive sequential evolutionary knowledge from the universe of natural protein sequences. Furthermore, structure-based encoders like ProteinMPNN learn the structural information of proteins through Graph Neural Networks. However, whether the incorporation of protein encoders can enhance the protein understanding of LLMs has not been explored. To bridge this gap, we propose EvoLlama, a multimodal framework that connects a structure-based encoder, a sequence-based protein encoder and an LLM for protein understanding. EvoLlama consists of a ProteinMPNN structure encoder, an ESM-2 protein sequence encoder, a multimodal projector to align protein and text representations and a Llama-3 text decoder. To train EvoLlama, we fine-tune it on protein-oriented instructions and protein property prediction datasets verbalized via natural language instruction templates. Our experiments show that EvoLlama's protein understanding capabilities have been significantly enhanced, outperforming other fine-tuned protein-oriented LLMs in zero-shot settings by an average of 1%-8% and surpassing the state-of-the-art baseline with supervised fine-tuning by an average of 6%. On protein property prediction datasets, our approach achieves promising results that are competitive with state-of-the-art task-specific baselines. We will release our code in a future version.
Contrastive Representation Learning for Predicting Solar Flares from Extremely Imbalanced Multivariate Time Series Data
Vural, Onur, Hamdi, Shah Muhammad, Boubrahimi, Soukaina Filali
Major solar flares are abrupt surges in the Sun's magnetic flux, presenting significant risks to technological infrastructure. In view of this, effectively predicting major flares from solar active region magnetic field data through machine learning methods becomes highly important in space weather research. Magnetic field data can be represented in multivariate time series modality where the data displays an extreme class imbalance due to the rarity of major flare events. In time series classification-based flare prediction, the use of contrastive representation learning methods has been relatively limited. In this paper, we introduce CONTREX, a novel contrastive representation learning approach for multivariate time series data, addressing challenges of temporal dependencies and extreme class imbalance. Our method involves extracting dynamic features from the multivariate time series instances, deriving two extremes from positive and negative class feature vectors that provide maximum separation capability, and training a sequence representation embedding module with the original multivariate time series data guided by our novel contrastive reconstruction loss to generate embeddings aligned with the extreme points. These embeddings capture essential time series characteristics and enhance discriminative power. Our approach shows promising solar flare prediction results on the Space Weather Analytics for Solar Flares (SWAN-SF) multivariate time series benchmark dataset against baseline methods.
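The core idea of aligning embeddings with class extremes can be sketched as a pull/push objective. Here the per-class "extreme" is simplified to the class mean, which only stands in for the maximally separated extreme points CONTREX derives from dynamic features; all data is synthetic and the embedding module itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy dynamic-feature vectors: rare flare (+) vs. abundant quiet (-) instances,
# mimicking the extreme class imbalance of SWAN-SF.
pos = rng.normal(loc=+1.0, size=(20, 5))
neg = rng.normal(loc=-1.0, size=(200, 5))

# Stand-in for the paper's extreme points: one anchor per class. CONTREX
# derives maximally separated extremes; the class mean is a simplification.
pos_extreme = pos.mean(axis=0)
neg_extreme = neg.mean(axis=0)

def contrastive_target_loss(embedding, is_positive):
    """Pull an embedding toward its own class extreme, push it from the other."""
    own = pos_extreme if is_positive else neg_extreme
    other = neg_extreme if is_positive else pos_extreme
    pull = np.sum((embedding - own) ** 2)
    push = np.sum((embedding - other) ** 2)
    return pull - push
```

Training the sequence embedding module to minimize this quantity drives flare and quiet embeddings toward opposite anchors, which is what gives the learned representations their discriminative power despite the imbalance.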